Sizing Applications for Solaris - Part 2

By Adrian Cockcroft, Staff Engineer, SMCC Product Marketing

This article is the second in a two-part series focusing on how to correctly size applications for the Solaris 2 operating environment, and how to avoid some of the most common problems and pitfalls inherent in the sizing process. Part I, which appeared in the Spring 1995 issue of Catalyst Flash, outlined how to choose the correct workload for your application. Part II discusses how to collect and evaluate test results given a particular workload. The information in this series is summarized from the recently published book, Sun Performance and Tuning: SPARC and Solaris, and the 1993 whitepaper, Sun Performance Tuning Overview White Paper, both by Adrian Cockcroft.

In general, when you tune an application, you focus on measuring the application processes and adjusting them to reduce their resource consumption. When you size an application, you focus instead on the system as a whole, working out what processor, I/O, and memory configurations your application requires for a range of common end-user workloads. Unfortunately, application sizing is often an afterthought, or is not performed at all. If you plan from the start, however, you can avoid some common pitfalls. This article should help to point you in the right direction.

Measuring the Workload on a System

At this point, you have decided which tests you wish to run on your system. Now, you need to know what measurements to make, and what to look for in those measurements. In the remainder of this article, I will outline how to set up data collection. I'll also cover the most common sizing issues, so you should be able to tell when the system is short of disks, network bandwidth, memory, or CPU power.

Using Accounting to Monitor the Workload

If you have access to a group of real end-users over a long period of time, then enable the UNIX system accounting logs. This can be useful on a network of workstations as well as on a single time-shared server. From this you can identify how often programs run, how much CPU time, disk I/O, and memory each program uses, and what work patterns look like throughout the week. To enable accounting, enter the three commands shown at the start of Figure 1. Also refer to the section "Administering Security, Performance, and Accounting in Solaris 2" in the Solaris System Administration AnswerBook and see the acctcom command. You must also add crontab entries to summarize and checkpoint the accounting logs. Collecting and checkpointing the accounting data puts a negligible additional load onto the system, but the summary scripts that run once a day or once a week can have a noticeable effect, so you should schedule them to run after hours.

Figure 1	How to Start System Accounting in Solaris 2

# ln /etc/init.d/acct /etc/rc0.d/K22acct
# ln /etc/init.d/acct /etc/rc2.d/S22acct
# /etc/init.d/acct start
Starting process accounting
# crontab -l adm
#ident  "@(#)adm        1.5     92/07/14 SMI"   /* SVr4.0 1.2   */
#min    hour    day     month   weekday
0       *       *       *       *       /usr/lib/acct/ckpacct
30      2       *       *       *       /usr/lib/acct/runacct 2> /var/adm/acct/nite/fd2log
30      9       *       *       5       /usr/lib/acct/monacct
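Once a day or two of accounting data has accumulated, you can replay it with acctcom. The lines below are a minimal sketch; the options shown exist in Solaris 2, but check the acctcom(1) man page for your release, and note that "fred" is just a hypothetical user name.

# acctcom -a /var/adm/pacct
# acctcom -u fred /var/adm/pacct

The first command lists the recorded processes and appends average CPU, memory, and I/O statistics; the second lists only the commands run by user fred.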

Collecting Long-term System Utilization Data

As a matter of course, collect overall system utilization data on all the machines you deal with. This facility already exists for Solaris 2; simply uncomment the entry for the sys user in the crontab file.

The Solaris 2 utilization log consists of a sar binary log file taken at 20-minute intervals throughout the day and saved in /var/adm/sa/saXX, where XX is the day of the month. This collects a utilization profile for an entire month. You should save the monthly records for future comparisons. When a performance-related problem occurs, it is far easier to identify the source of the problem if you have measurements from a time when the problem was not present. Remember that the real-life user workload is likely to increase over time. You should try to produce a plot to look for long-term utilization trends.

An example crontab file is shown in Figure 2. Note that sar does not collect network-related information.

Figure 2	crontab Entry for Long-term sar Data Collection

# crontab -l sys
#ident	"@(#)sys	1.5	92/07/14 SMI"	/* SVr4.0 1.2	*/
#
# The sys crontab should be used to do performance collection. See cron
# and performance manual pages for details on startup.
#
0 * * * 0-6 /usr/lib/sa/sa1
20,40 8-17 * * 1-5 /usr/lib/sa/sa1
5 18 * * 1-5 /usr/lib/sa/sa2 -s 8:00 -e 18:01 -i 1200 -A
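To examine the data later, point sar at one of the saved binary files with the -f option. A small sketch (sa15 is whichever day's file you are interested in):

# sar -u -f /var/adm/sa/sa15
# sar -d -f /var/adm/sa/sa15

The -u report shows CPU utilization over the day and -d shows disk activity; the columns are plain text, so they are easy to feed into a plotting tool when you look for long-term trends.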

Interpreting the Measurements

Once you have measured the performance of your application and the utilization of the system on which it is running, you have a problem. The numbers reported by commands like vmstat, iostat, netstat, and sar are mostly useful for debugging and tuning the Solaris kernel itself. They are poorly documented, the meaning and expected behavior of each metric is not clear, and the behavior changes without notice between Solaris releases. If you have UNIX systems from other vendors, you will find that there is minimal consistency between UNIX implementations.

Just because two different systems print a column of numbers from vmstat with the same heading, you cannot assume that the same virtual memory algorithm and parameters underlie those measurements. There is some work underway to produce an X/Open standard for performance measurements, but current standards only define the interface. The implementation and performance of a standardized interface can vary.

I have tried to document the meaning of most of these measurements for both Solaris 1 and Solaris 2 in my 1993 white paper, Sun Performance Tuning Overview White Paper and my recent book, Sun Performance and Tuning: SPARC and Solaris. Within the limited space of this article, I will try to provide enough information to help you determine where overloading may have occurred: disks, network, available RAM, or CPUs.

The system will often have a disk bottleneck.

In many cases the most serious bottleneck is an overloaded or slow disk. Use iostat -x 30 to look for disks that are more than 30 percent busy and have service times of more than 50 ms. The service time is key; it is the time between a user process issuing (for example) a read and that read completing, so it is often in the critical path for response time. If many other processes are also accessing that disk, a queue can form, and service times of over 1000 ms (not a misprint, over one second!) can easily occur as you wait to get to the front of the queue. A service time of 15 to 20 ms on active disks is healthy.
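One way to spot such disks automatically is to filter the iostat output. This is only a sketch: it assumes the Solaris 2.4 iostat -x column order (disk, r/s, w/s, Kr/s, Kw/s, wait, actv, svc_t, %w, %b), so check the header on your release before trusting the field numbers.

# iostat -x 30 | awk '$8+0 > 50 && $10+0 > 30 { print $1, "svc_t=" $8, "%b=" $10 }'

The awk filter prints the name, service time, and percent-busy figure for any disk that exceeds both thresholds in a 30-second interval; the $8+0 forms force numeric comparisons so that the header lines are ignored.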

After first-pass tuning, the system will still have a disk bottleneck!

Keep checking iostat -x 30 as tuning progresses. When a bottleneck is removed, the system may start to run faster, and as more work is done, some other disk will overload. At some point you may need to stripe filesystems and tablespaces over multiple disks.

Poor NFS response times may be hard to see.

Waiting for a network-mounted filesystem to respond is not counted in the same way as waiting for a local disk. The system will appear to be idle when it is really in a network I/O wait state. Use nfsstat -m to find out which NFS server is likely to be the problem, and tune its disks. Look at the NFS operation mix with nfsstat on both the client and the server and, if writes are common or the server's disk is too busy, configure a PrestoServe or NVSIMM non-volatile write cache in the server. Check that you are not overloading the Ethernet by making sure the collision rate is low. If collisions are above five percent, try to split up the network or replace it with something faster. For more information, see the SMCC NFS Server Performance and Tuning Guide on the Solaris 2.4 SMCC Hardware AnswerBook CD.
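The collision rate itself is easy to derive from netstat -i, which reports output packets (Opkts) and collisions (Collis) per interface. A sketch, assuming the interface is le0 and that Opkts and Collis are the seventh and ninth fields as in Solaris 2.4; substitute your own interface name and check the column order on your release:

# netstat -i | awk '/^le0/ && $7 > 0 { printf "collision rate %.1f%%\n", 100 * $9 / $7 }'

Anything much above five percent, sustained, is a sign that the segment should be split or upgraded.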

Avoid the common memory usage misconceptions.

When you look at the free column reported by vmstat, please don't waste time worrying about where all the RAM has gone. After a while, the free list stabilizes at around one sixteenth of the total memory configured, or at 1 MB or less, depending upon the Solaris release and kernel architecture in use. Above this level the system stops bothering to reclaim memory from the file cache, even when you aren't running anything. This is normal behavior.

Don't panic when you see page-ins and page-outs in vmstat.

These are normal since all filesystem I/O is done using the paging process. Hundreds or thousands of Kbytes paged in and paged out are not a cause for concern, just a sign that the system is working hard.

Use page scanner activity as your RAM shortage indicator.

When you really are short of memory, the scanner will run continuously at a high rate (over 200 pages/second averaged over 30 seconds). If it runs in separated high-level bursts, you should try patching slowscan to 100 so that the bursts become longer and slower.
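The scan rate appears in the sr column of vmstat. The following is a sketch only; the slowscan change is a tuning experiment, and the value added to /etc/system takes effect at the next reboot, so test it carefully before relying on it.

# vmstat 30
# echo "set slowscan=100" >> /etc/system

Watch the sr column over several 30-second samples before deciding that the system really is short of RAM.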

Look for a long run queue (vmstat procs r).

If, in a multiuser system, the run queue or load average is more than four times the number of CPUs, then processes wait too long for a slice of CPU time. This waiting can increase the interactive response time seen by users. Add more or faster CPUs to the system.
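A quick way to apply this rule of thumb (a sketch; psrinfo is available in recent Solaris 2 releases, and uptime or the vmstat r column gives the load figure):

# psrinfo | wc -l
# uptime

Compare the load averages reported by uptime, or the r column of vmstat, with the number of lines psrinfo prints; a sustained figure of more than four times the CPU count suggests you need more or faster processors.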

Look for processes blocked waiting for I/O (vmstat procs b).

This is a sign of a disk bottleneck. If the number of processes blocked approaches or exceeds the number in the run queue, check the disks. If you are running database batch jobs, it is OK to have some blocked processes, but you can increase batch throughput by removing disk bottlenecks.

Check for CPU system time dominating user time.

If there is more system time than user time and the machine is not an NFS server, you may have a problem. To find out the source of system calls, use the truss command. To look for high interrupt rates and excessive mutex contention, use the mpstat command. If the smtx column on a multiprocessor system is more than 200 per CPU, and there is a coincident increase in system CPU time for that CPU, then you have probably reached the limit of MP scalability for that workload/OS release combination. A subsequent OS release is likely to scale better.
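As a sketch of how to narrow this down (1234 is a hypothetical process ID; substitute the busiest process reported by ps):

# mpstat 30
# truss -c -p 1234

mpstat reports the smtx and interrupt counts for each CPU every 30 seconds; truss -c attaches to the process and counts its system calls, printing a summary of which calls dominate when you interrupt it with Control-C.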

Additional Information

For additional information, you can refer to the following publications:
  1. SMCC NFS Server Performance and Tuning Guide

    This is on the Solaris 2.4 SMCC Hardware 11/94 CD in the hardware-specific AnswerBook manual.

  2. Sun Performance and Tuning: SPARC and Solaris, Adrian Cockcroft, SunSoft Press/Prentice Hall, January 1995. ISBN 0-13-149642-3.

    For more information try http://www.sun.com/smi/ssoftpress/index.html.

  3. Sun Performance Tuning Overview White Paper, Adrian Cockcroft, December 1993.

    This paper is available for ftp from most of the SunSite servers. Look for the file SunPerfOvDec93.ps.Z. A Japanese translation was published as volume 10 of the Nihon Sun "Expert" series.

  4. The Art of Computer Systems Performance Analysis, Raj Jain, Wiley, 1992. ISBN 0-471-50336-3.